Topic Modeling of Phonetic Latin-Spelled Arabic for the Relative Analysis of Genre-Dependent and Dialect-Dependent Variation

Authors

  • Ali Sakr
  • Mark Hasegawa-Johnson
Abstract

We demonstrate a data collection and analysis system that can be used to analyze the relative contributions of genre-dependent and dialect-dependent variation in the lexicon of speech-like Arabic text. We use Latent Dirichlet Allocation (LDA), a generative probabilistic modeling method, to analyze a corpus of online chat written in phonetic Latin-spelled Arabic. The corpus exhibits different word choices and word relations depending on dialect, and can therefore aid in producing written forms of Arabic dialects despite the large differences between Standard Written Arabic and the many Arabic dialects.
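
To illustrate the kind of LDA analysis the abstract describes, the following is a minimal sketch and not the authors' system. It fits LDA by collapsed Gibbs sampling, chosen here only because it needs nothing beyond the standard library; the abstract does not say which inference procedure was used. The function name `fit_lda`, the hyperparameter values, and the toy Latin-spelled chat tokens are all assumptions made up for illustration and are not drawn from the paper's corpus.

```python
import random
from collections import Counter

def fit_lda(docs, num_topics=2, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Hypothetical helper: collapsed Gibbs sampling for LDA on tokenized docs."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]          # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(num_topics)]   # count of word w in topic k
    topic_total = [0] * num_topics                        # tokens assigned to topic k
    assign = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the collapsed conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return doc_topic, topic_word

# Made-up toy messages (NOT from the paper's corpus): the first two use
# Egyptian-style spellings, the last two Levantine-style spellings.
docs = [
    ["izzayak", "eh", "keda", "izzayak"],
    ["eh", "keda", "eh", "izzayak"],
    ["kifak", "shu", "hek", "kifak"],
    ["shu", "hek", "shu", "kifak"],
]
doc_topic, topic_word = fit_lda(docs)

# Show the most frequent words assigned to each topic.
for k, counts in enumerate(topic_word):
    print(k, [w for w, _ in counts.most_common(3)])
```

On a real corpus, the per-document topic counts in `doc_topic` could be compared across messages grouped by dialect or by genre to see which grouping the learned topics track more closely.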

Similar resources

Borrowing the Verb “ast” and Its Varieties in Arabic Dialect of Sarab

“Borrowing” is a linguistic process studied in diachronic linguistics, in which one language borrows elements from another. It usually occurs in areas where two languages are in contact. Such borrowing is found in a dialect spoken in South Khorasan. The Arabs living in this part of Iran probably immigrated in the early centuries of Islam. In thi...

The Status of [h] and [ʔ] in the Sistani Dialect of Miyankangi

The purpose of this article is to determine the phonemic status of [h] and [ʔ] in the Sistani dialect of Miyankangi. Auditory tests applied to the relevant data show that [ʔ] occurs mainly in word-initial position, where it stands in free variation with Ø. The only place where [h] is heard is in Arabic and Persian loanwords, and only in the pronunciation of some speakers who are educated and/or...

Modeling and Non-modeling Genre-based Approach to Writing Argument-led Introduction Paragraphs: A Case of English Students in Iran

Despite the crucial role of introductory sections in argumentative academic writing, the effects of genre-based approaches to writing introductory paragraphs have not been much explored yet. The present study aimed to investigate whether the provision of genre knowledge through modeling and non-modeling could enhance learners’ ability in writing introductory paragraphs of argumentative essays....

Multidialectal Spanish acoustic modeling for speech recognition

During the last years, language resources for speech recognition have been collected for many languages and specifically, for global languages. One of the characteristics of global languages is their wide geographical dispersion, and consequently, their wide phonetic, lexical, and semantic dialectal variability. Even if the collected data is huge, it is difficult to represent dialectal variants...

Mathematical Modeling of the Temperature-Dependent Growth of Living Systems

In this investigation, a non-equilibrium thermodynamic model of the temperature-dependent biological growth of living systems is analyzed. The results are derived on the basis of the Gompertzian growth equation. In this model, we consider a temperature-dependent growth rate and development parameter. The non-equilibrium thermodynamic model is also considered for exploring the variat...

Publication date: 2012